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I Ultra-high energy charged particles of unknown origin, which in- 

f*-*) teract in the high atmosphere of the Earth generating large cascades 

£SJ of secondary particles, can be observed from the ground. For many 

decades, they have been a puzzle for particle physicists and astro- 
physicists. They seem to arrive from random directions in the sky, 
although the most energetic ones are supposed to point towards their 
sources with good accuracy. As an attempt to discriminate among 
several possible production scenarios, astrophysicists try to test the 
statistical isotropy of the directions of arrival of these cosmic rays. 
i i At the highest energies however, the observed cosmic rays are very 

Qh rare, and testing the distribution of such small samples of directional 

•^U data on the sphere is non trivial. We address here a nonparametric 

detection problem, making use of a multiscale analysis of the ob- 
servation sample, based on a decomposition of the directional data 
~tT. using a wavelet frame on the sphere, the needlets. The aim is to test 

i i the isotropy, or the equality of the distribution with a known one. 

We explore two particular procedures, a multiple test and a plug- 
in approach. We describe the practical implementation of these two 
procedures and compare them to other methods in the literature. As 
alternatives to isotropy, we consider both very simple toy-models, and 
rV-s more realistic non isotropic models based on Physics-inspired simu- 

|y-\ lations. The Monte Carlo study shows the good performances of the 

multiple test at moderate sample size, together with the robustness 

of its sensitivity with respect to the unknown characteristics of the 

alternative hypothesis. On the 69 most energetic events published 

by the Pierre Auger collaboration, our procedure detects significant 

• • departure from isotropy. The flexibility of this method and the pos- 

^ sibility to modify it to take into account a large variety of extensions 

k>< of the problem make it an interesting option for future investigation 

5_j of the origin of ultra-high energy cosmic rays. 

1. Introduction. 

1.1. Motivations. It is a common problem in astrophysics to analyse data 
sets containing measurements of a number of objects (such as galaxies of a 
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2 G. FAY ET AL. 

particular type) or of events (such as cosmic rays or gamma ray bursts) 
distributed on the celestial sphere. Each set of such objects or events can be 
represented as a collection of positions Xi = (6>j, 0j), i = 1, . . . , n inS the unit 
sphere of R 3 . In many cases, such objects trace an underlying probability 
distribution / on the sphere, which itself depends on the physics which 
governs the production of the objects and events. Galaxies, for instance, 
form in over-densities of a preexisting smooth field of distribution of matter 
in the universe, and the study of the statistics of their distribution has grown 
into a field of astrophysics by itself (Martinez and Saar, 2002). 

The case of ultra high energy cosmic rays (UHECRs) is of particular 
interest, and is the main focus of the present work. UHECRs are particles of 
unknown origin which arrive at the Earth from apparently random directions 
of the sky. These particles interact with atoms of the upper atmosphere, 
generating a huge cascade of billions of secondary particles. The observation 
of these secondary particles with appropriate detectors on ground permits 
the measurement of the direction of arrival and of the energy of the original 
cosmic ray. 

The existence of cosmic rays has been known for about a century. Such 
particles exist with a very wide range of kinetic energies, from few eV to 
more than 10 20 eV. 1 Observed cosmic rays are typically ordinary charged 
particles (electrons, protons and nuclei), propagating in empty space, and 
deflected by galactic magnetic fields. The rate of observed cosmic rays in 
the vicinity of the Earth, however, decreases rapidly with energy. At low 
energy, the observed cosmic rays are numerous and their composition is well 
known. There also exist several known astrophysical processes responsible 
for their acceleration, such as stellar winds for the least energetic ones, to 
violent phenomena such as supernovae shock waves at higher energy. At the 
highest energies (E > 10 20 eV), however, the observed flux is of the order 
of 1 event per square kilometre per century, which limits the statistics of 
observed events to few tens of events (in two decades of observations). In 
addition, no understood astrophysical process, involving known objects, can 
accelerate particles to such tremendous energies. 

Recent observations of ultrahigh-energy cosmic rays suggest that they are 
ordinary particles, such as protons and nuclei, accelerated in extremely vio- 
lent astrophysical phenomena (see ?, for a recent review on the astrophysics 
of UHECRs). However, many alternate hypotheses concerning their nature 
and origin have been proposed over the years (see, e.g., Hillas, 1984; Torres 
and Anchordoqui, 2004; ?). UHECRs could originate from active galactic 
nuclei (AGN), or from neutron stars surrounded by extremely high mag- 
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TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 3 

netic fields, or yet from many other processes. It is also possible that the 
type and origin of ultra high energy cosmic rays (at energies above 10 19 eV) 
depend, at least to some extent, upon the energy at which they are observed. 
Indeed, the most energetic cosmic rays cannot propagate very far (i.e. not 
much more than ~ 100 Mpc), without loosing most of their energy by inter- 
actions with photons from the Cosmic Microwave Background (the so-called 
GZK effect; Greisen, 1966; Zatsepin and Kuz'min, 1966). The confirmation 
of the energy cut-off at the high end of the cosmic ray spectrum is one of 
the main achievements of the Pierre Auger Observatory (Abraham et al., 
2008; ?). 

Before the location and physical process of acceleration have been clearly 
identified, taking into account the fact that most of the evidence about 
the chemical composition of cosmic rays at the highest energies rely on 
extrapolations of the present knowledge of hadronic interactions at energies 
two orders of magnitude above the range presently tested at the LHC, it 
is difficult to completely rule-out alternate theoretical explanations as to 
what UHECRs exactly are and what is their origin. Alternate hypotheses 
such as production by decay of long-lived relic particles from the Big Bang, 
about 13 billion years old (Bhattacharjee and Sigl, 2000), are just starting to 
be disfavored by the observations of the Auger collaboration, with recently 
published results about primary photon limits that impose stringent limits 
on these kind of models (?). 

In an attempt to better understand the origin of such UHECRs, physi- 
cists study the statistical distribution of their directions of arrival, looking 
for two particular signatures. First, the (statistically significant) arrival of 
more than one UHECR from the same direction on the sky would indicate 
that their production is not likely to originate from single time events (e.g. 
catastrophic mergers of two compact astrophysical objects), but rather from 
sources which emit UHECRs regularly 2 Second, one may look for correlation 
in the directions of arrival of UHECRs with known astrophysical objects, as 
nearby active galactic nuclei, in an attempt to identify plausible production 
sites. Hence, in some hypotheses, the underlying probability distribution for 
the directions of incidences of observed UHECRs would be a finite sum of 
point-like sources - or nearly point-like, taking into account the deflection 
of the cosmic rays by magnetic fields. In other hypotheses, the distribution 
could be uniform, or smooth and correlated with the local distribution of 
matter in the universe. The distribution could also be a superposition of the 



2 With the caveat that the time of propagation may depend on the energy and on the 
exact trajectory followed by the UHECR to reach us, making it possible that two particles 
reaching the Earth at different times have actually been emitted simultaneously. 
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4 G. FAY ET AL. 

above. Distinguishing between these hypotheses is of primordial importance 
for understanding the origin and mechanism of production of UHECRs. 

In the past 20 years, a number of experiments have gathered observa- 
tions of UHECRs, and several papers have been written which look for such 
features in the distribution of their directions of arrival, with sometimes con- 
tradictory conclusions. The difficulty lies in the fact that UHECRs are rare, 
and that they do not arrive necessarily exactly from the direction where their 
source is located. Indeed, as typical cosmic ray particles are charged (which 
permits their acceleration by electromagnetic processes), they are deflected 
by Galactic and intergalactic magnetic fields. The deflection depends on the 
length of the path through the magnetic field, and on the energy and charge 
of the particle. In fact, only very energetic cosmic rays (above few 10 19 eV) 
with small charge (e.g. protons or nuclei with small atomic numbers) are ex- 
pected to travel typical astrophysical distances from their source to us with 
deflection angles smaller than a few degrees. Details of the deflections are 
not known, as neither the exact magnitude, orientation, and regularity on 
large scales of Galactic and extragalactic magnetic fields, nor the distance 
of the sources of UHECRs, nor the exact energy of the incoming cosmic ray, 
nor its charge (to within a factor of 26 between protons and iron nuclei), 
are known. Errors on the direction of the source of an UHECR can then be 
of order 1° at the lowest (typical error on the measurement of the direction 
of arrival with Auger), up to few degrees for protons, or tens of degrees for 
heavy nuclei travelling a long path through a regular galactic magnetic field. 

Given a set of observed UHECRs, how can one best test for "repeaters" 
(cosmic rays coming from the same source) or more generally anisotropy in 
the distribution? If one restricts the analysis to the few events for which one 
is sure that the deflection angle is negligible, events are scarce and there 
is not enough statistics to conclude. As one selects events with less energy, 
the direction of origin becomes less reliable, with the total number of events 
completely dominated by those events with poorly constrained direction of 
origin. Finally, it is not clear how to build the isotropy test, without any 
sound prior knowledge about the uncertainty in the measured direction of 
the source. All of these are very meaningful questions to analyse UHECR 
observations. 

Recently, an analysis of the direction of arrival of 27 UHECRs observed 
by the Pierre Auger experiment concludes in the existence of an anisotropy, 
and a correlation with objects in a catalogue of nearby active galactic nuclei 
(AGNs), located at distances lower than about 70 Mpc 3 (Abraham et al., 
2008). This anisotropy, however, is less obvious in a more recent analysis, 
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TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 5 

based on 69 observed events (The Pierre AUGER Collaboration et al., 2010). 
Clearly, the statistics is limited, and the development of new methods for in- 
vestigating this topic can provide new insights on the origin of the UHECRs. 
Methods independent of external data sets such as the fore-mentioned VCV 
catalogue (which is not a statistically well-characterized sample of AGNs 
but a compilation of published results) are of particular interest. 

1.2. Outline of this work. This work focuses on the important question 
of the isotropy of the cosmic rays. Because of the small number of avail- 
able data, this question is not answered yet, although data from the Pierre 
Auger collaboration seems to hint at a correlation between the directions 
to the ultra-high energetic events (above 5.5 x 10 19 eV) and the directions 
to active galactic nuclei in the catalogue compiled by Veron & Cetty-Veron 
(see The Pierre AUGER Collaboration et al., 2008, 2010). From a statistical 
point of view, we address the question of testing the goodness-of-fit of the 
isotropy assumption to this small sample of directional data. The framework 
we choose is purely nonparametric as we do not want to favour any particu- 
lar alternative hypothesis, and as we wish to be able to discover unexpected 
forms of anisotropy. 

The paper is organized as follows. In Section 2, we present a simplified 
model of cosmic ray propagation which will be used in Monte Carlo simula- 
tions to test the method. In Section 3 we present the nonparametric frame- 
work. Then we describe our needlet based anisotropy tests in Section 4. In 
section 5, we present a Monte Carlo experiment that compares the power of 
the different test in terms of sensitivity and robustness with respect to the 
parameters of the methods. We apply our procedures to real data from the 
Pierre Auger collaboration in Section 6. We then conclude and give perspec- 
tives for future extensions of the present work. Appendix A in the on-line 
supplemennt is devoted to a short description of the type of wavelets we 
have used (the needlets) and the practical and numerical implementation of 
our methods. 

2. Simulating cosmic ray emission. In our investigation of tools to 
analyse the distribution of UHECR events, we need a way to simulate a dis- 
tribution of observed events, as a function of an underlying physical model. 
A complete Monte Carlo simulation of the physical processes of cosmic ray 
emission and propagation in the magnetic fields is beyond the scope of this 
paper, and too dependent on a number of physical assumptions for which 
there is little available knowledge. We decide to perform qualitatively rel- 
evant simulations using a simple, although physically representative, toy 
model of cosmic ray emission and propagation. 
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6 G. FAY ET AL. 

2.1. Cosmic ray sources. In one hypothesis (Ho), we will assume that 
cosmic rays are emitted from a uniform distribution of many sources, i.e. 
their directions of arrival are independent of the energy, and uniformly dis- 
tributed on the celestial sphere. In the alternate hypothesis (Hi), we will 
assume that n cosmic rays originate from a small number n s of sources, 
distributed uniformly in a spherical volume V of universe, of radius r max = 
70 Mpc. For n s 3> n, the distribution of directions of origin will be close to 
uniform, and (-Hi) indistinguishable from (Hq). For n 3> n s , and n s small, 
coincidences in the directions of arrival of the observed UHECRs will permit 
to identify easily the directions of the emitting sources. Our objective is to 
address the issue when n s is comparable to the number of observed events 
n. 

Simulations are performed as follows: 

• We fix the number n s of sources and distribute them uniformly in the 
volume V. We assume that all sources are physically identical, i.e. they 
emit cosmic rays with the same probability, and the same distribution 
in energy, the latter coinciding with the observed flux dN/dE. 

• We fix the number n of observed cosmic rays, and draw at random their 
energies according to the distribution n(E) oc E~ a , E £ [E m { n , E max ], 
a > 0. 

• For each observed cosmic ray, we assign at random a corresponding 
emitting source, according to a probability density inversely propor- 
tional to the square of the distance D to the source (sources nearer pro- 
duce a larger flux on Earth). This probability distribution can be mod- 
ulated by the acceptance of the instrument for studying realistic test 
cases. For instance, The Pierre AUGER Collaboration et al. (2010) use 
69 highest energy events for the search of correlations with astrophys- 
ical sources, selected by a cut in zenith angle of arrival (# zen ith < 60°). 
Assumed homogeneous time coverage in UT over the years of observa- 
tion, the coverage is computed straightforwardly from simple geomet- 
rical considerations (see ?, and the details at the end of Section 2.2). 
The map of Auger coverage computed in this way is displayed in fig- 
ure 1. The effect of the GZK cutoff is taken into account simply by 
limiting the volume to a sphere of 70 Mpc radius. 

• For each cosmic ray, we modify the direction of arrival due to extra- 
galactic magnetic fields. The next subsection describes the model used 
to implement these deflections. 

2.2. Deflection by Galactic and extragalactic magnetic fields. Galactic 
magnetic fields are an important component of the Galactic interstellar 
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Fig 1: Coverage function g for the Pierre- Auger observatory in Galactic 
coordinates, represented through a Mollweide projection and computed from 
geometrical considerations (see ?). Moreover, g has been rescaled to be a 
probability density function on the sphere. 



medium (ISM). They can be probed in a variety of ways. The impact of 
local magnetic fields is observed in the optical wavelength range via starlight 
polarization, as elongated interstellar dust grains in the foreground of the 
observed star, aligned perpendicularly to magnetic field lines, absorb prefer- 
entially one direction of starlight polarisation. Measurements of many stars 
reveal a general picture of the magnetic field in the Milky Way near the Sun 
(??). Aligned dust grains also emit polarized infrared emission, which can 
be used to to infer magnetic fields in dust clouds (?). Zeeman splitting of 
radio spectral lines allows for the direct measurement of relatively strong 
fields in nearby, dense gas clouds in the Milky Way (?). On larger-scales, 
the magnetic field of our Galaxy can be probed in three dimensions using 
Faraday rotation of pulsar signals (?). Finally, synchrotron emission, emit- 
ted by relativistic electrons spiralling in the magnetic field, can be used to 
constrain the direction and amplitude of the magnetic field either from direct 
observation of the synchrotron polarisation (?), or by measuring the Fara- 
day rotation of Galactic synchrotron using multi-wavelength observations in 
the radio range (below few GHz) (?). 

In the vicinity of the Sun, the Galactic magnetic field has a typical am- 
plitude of few microGauss. This amplitude is typically increasing with de- 
creasing distance towards the Galactic center, where it can reach values of 
few tens of microGauss, and up to a few milliGauss in very local regions. 
In general the regular component over most of the outer Galaxy is of order 
few microGauss, aligned along the Galactic plane. The overall field struc- 
ture follows the optical spiral arms, with evidence for at least one large-scale 
field reversal in the disk, inside the solar radius, and several distortions near 
star-forming regions. 

For the purpose of estimating their impact on the deflection of high energy 
cosmic rays, Galactic magnetic fields are typically modeled as the sum of 
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8 G. FAY ET AL. 

two components with different physical properties, a regular component, and 
a turbulent component. The regular component roughly follows the spiral 
arms of the Galaxy, and induces deflections typically perpendicular to the 
Galactic plane, i.e. deflections in latitude of arrival. The turbulent compo- 
nent induces random deflections, which can be modeled as two-dimensional 
Gaussian distributions centered at the source. Indeed, we assume that such 
deflections are made of the superposition of many independent small deflec- 
tions by independent regions with independent magnetic field directions, so 
that the Gaussian hypothesis is justified by the central limit theorem. We 
consider only cases in which the total deflection is small enough that the 
projection to the sphere is irrelevant (as well as the truncation of angles to 
2ir). Typical deflections for atomic nuclei are as follows (Harari et al., 2002). 
For the regular component (magnetic lensing effect): 



1 20 eV \ /_B 
E/Z ) \2fiGj V3kpc 



5 reg = 3.25° ( -1—L )[ — )[^J—) (1) 



where E is the energy of the UHECR in eV, Z is the atomic number (e.g. 1 
for hydrogen nuclei (protons), 2 for Helium nuclei (alpha articles), etc.), B 
is the magnetic field in micro Gauss {(jlG), and r the propagation length of 
the cosmic ray in the magnetic field. The deflection is assumed deterministic 
(although energy-dependent), and the instantaneous direction of the deflec- 
tion is along v x B, where v is the velocity of the incoming particle and B 
the regular Galactic magnetic field, assumed to be along the y-axis of the 
Galactic coordinate system. 

For the turbulent component (random deflection): 



Jturb 



0.56° 




The deflection is Gaussian distributed with a standard deviation <5 tur b, and 
uniform distribution of the direction of the deviation in [0, 2tt[. The deflec- 
tions are written in terms of the typical expected values for the magnetic 
field (2 /iG for the regular part and 4 //G for the turbulent part), for coher- 
ence length Lg a i of the turbulent part of the Galactic magnetic field (about 
50 pc). 3 kpc is the typical propagation length r inside the Galactic mag- 
netic field for a cosmic ray coming perpendicularly to the Galaxy. A plane 
parallel approximation of the disc-shaped geometry of the Milky Way sug- 
gests a dependence of r with the Galactic latitude b of the incoming cosmic 
ray. We assume here a dependence r oc l/sin6, with a maximum length of 
10 kpc, typical of the size of the Galactic disk. 
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TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 9 

Extragalactic magnetic fields also deflect cosmic rays originating from 
distant locations in the Universe. These deflections are expected to be qual- 
itatively similar to those due to the turbulent part of the Galactic magnetic 
field, except that typical field strengths are smaller (and less well known) 
and correlation lengths are larger. Following The Pierre Auger collaboration 
(The Pierre AUGER Collaboration et al., 2008), we assume a deflection with 
standard deviation given by: 



/ 10 20cvx ( B \ D /L cxt 

6 ^~ 2A \~EjZ~) Vl^JVToOMp^V^ (3) 

UHECRs are observed to arrive on Earth with a flux dN/dE proportional 
to £^ 42 for energies E > 4x 10 19 eV (Abraham et al., 2008). Although the 
shape of the spectrum is not very well constrained in this region (more recent 
Auger results suggest a spectral index closer to —4.3), the exact shape of 
the spectrum does not have a strong impact on the validity of our analysis. 
Our simulations will assume such a distribution, with various values for the 
minimum energy E m i n , and E ma _ x = 10 21 eV. We focus on very energetic 
UHECRs (E > 10 19 eV), and assume UHECRs are light nuclei (Z « 1), for 
which deflections by magnetic fields are expected to be of the order of a few 
degrees. 

We then implement cosmic ray deflections according to equation (3) (first 
the cosmic ray first travels in the intergalactic medium), and then using 
both equations (1) and (2). As the exact nature of the cosmic rays has little 
impact on the general principles of our method, except that a change in 
atomic number induces a change in the scale of the deflections, we have 
assumed here for simplicity that all cosmic rays are protons (i.e. Z = 1). 
This however, as a further refinement, can be easily changed for practical 
application on real data sets. In particular, the presence or lack of anisotropy 
in the directions of arrival of the highest energy cosmic rays may help shed 
light on the nature of these particles, as iron nuclei, for instance, are more 
deflected by magnetic fields than protons, by a factor Z[ TOQ = 26. This is an 
important point to take into account in view of recent Auger results that 
seem to indicate a low proton fraction at energy above 10 18 eV, so that the 
cosmic rays at those energies might be essentially heavier nuclei (?). 

Figure 2 illustrates simulated outcomes in two extreme cases: few sources 
and many cosmic rays (right) and many sources and few cosmic rays (left). 

In practice, instruments observe the sky unevenly. The capability of the 
instrument to observe in a particular direction of the sky depends on the 
field of view of the instrument, and on the orientation of the instrument 
with respect to the sky (which itself depends on the sidereal time). From 
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Fig 2: Two simulations of the physical model described in Section 2, with 
a = 4.2, E min = 4 x 10 19 ,£' max = 10 21 . On the left, the number of sources 
is n s = 1000 and the number of observations is n = 100. On the right, 
n s = 100 and n = 1000. It appears in this latter case that clusters of events 
are of different typical angular size. 



the properties of the instrument and the geometry of the observations, one 
can infer an equivalent observing time as a function of direction on the sky, 
i.e. a function on the sphere that modulates the probability of detection 
of the observed cosmic rays. As an illustration, we have displayed on Fig- 
ure 1 a Mollweide projection of the coverage map associated with the Pierre 
Auger observatory, in Galactic coordinates, computed following section 2 in 
?. This coverage map has been generated assuming a maximum accepted 
zenith angle for incoming cosmic rays of Zenith = 60, and uniform distri- 
bution of observation periods in universal time (and hence, a coverage that 
depends exclusively of the declination, not the right ascension, in equatorial 
coordinates). The effect of the precession of equinoxes has been neglected 
for generating this coverage map (the perturbations it would generate are 
very tiny as compared to what we can measure with about 100 events, as 
currently available). 

3. Nonparametric tests on the sphere. 

3.1. Introduction. Suppose that (X\, . . . , X n ) is an n sample of i.i.d ran- 
dom positions on the two-dimensional sphere with probability density func- 
tion /. Let /o = 1/47T denote the uniform density on S. Our aim, as explained 
in the previous section is to test 



(Ho) :f = fo 



against 



(Hi) : / # /o ■ 



(71) 



One can take into consideration the non-uniform angular acceptance in the 
observation model by considering some known and arbitrary function g : 
§ —7- [0, 1]. Incoming events from direction £ G § have probability g(£) to be 
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TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 11 

observed by the instrument. In this case the observed directions X±, . . . , X n 
are distributed along a density which is proportional to g under the null 
hypothesis (isotropy of the original distribution). Let divide g by Lg(£) d£ 
so that g is now normalized to be a density probability function on the 
sphere. In order to test for isotropy of the physical phenomenon in this 
context, we need to implement the test 

(H ):f = g against (Hi) : / # g , (T 2 ) 

On the real line, testing for / = g can be reformulated as testing for the 
uniform distribution of the sample G(X\), . . . , G(X n ) on [0,1] where G is the 
distribution function associated with the probability density g. For higher 
dimensions (as on the sphere), there is no natural transformation of the 
data, no notion of distribution function for directional data, that allows to 
recast 7i as 71 . Then we consider 7i in its generality, with 71 as a particular 
case. 

Our aim in this paper is to provide test algorithms which are at the same 
time easy to implement, efficient in practical situations where the sample 
size is small (a few tens) and the data may be collected in a non uniform nor 
complete way, but also with properties that are likely to be optimal from a 
theoretical point of view. 

Let us begin with a short review on nonparametric tests associated to 
function estimation, since this will inspire our study in many ways. 

3.2. Anisotropy tests among general nonparametric tests. The test prob- 
lem is well posed when the alternative is given. More often in practice it is 
wiser to consider a large nonparametric class of alternatives. To allow deriva- 
tion of optimality properties, following standard point of view in a nonpara- 
metric framework (see for instance Ingster and Suslina, 2003; Ingster, 1993; 
Butucea and Tribouley, 2006), we shall consider smooth alternatives of the 
form 

(H ltn ):f€? n (d,C) (4) 

where 

T n {d,C) = {he1l:d(h,g)>Cr n } (5) 

and 1Z is a class of regularity, that contains for example all the twice contin- 
uously differentiable densities, or densities satisfying the Holder condition 
with Holder exponent s > 0. We may consider balls in Sobolev or Besov 
spaces (see below). Here, d is a (semi-) distance between densities and r n 
is referred to as a separation rate. Roughly speaking, d and r n respectively 
define the shape and the size of the neighbourhood of the density under the 
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12 G. FAY ET AL. 

null which is excluded from the alternative set of densities. The multiplica- 
tive constant C allows to define the concept of critical separation rate; see 
Eqs. (8) and (9) below. 

The choice of such alternatives is essential for the test procedure because 
the test statistics are built, more or less, on estimators of d(f,g). For some 
particular distances, nonpar ametric estimators / of the density of the ob- 
served sample may be plugged into the distance, namely 

d(f,g) = d(f,g). 

For instance, / could be an histogram-like (pixel-wise constant) density esti- 
mate of / based on counting events falling in any pixel of a given tessellation 
{Vk}k=i,...,K of the sphere, namely 

K i 



f=-^T#{X i £V k ,i = l,...,n}- [ - 



n^" L •-""■ ' ' V(V fc ) 

and the decision could be taken on the value of d(f,g) = \\f — g\\ 2 , say. 
Nevertheless, as described in Ingster (2000), such "plug-in" procedures are 
not always optimal in terms of rates of separation (see Section 4.2 for a more 
precise statement). In contrast, multiple tests have nice theoretical (mini- 
max optimality and adaptivity) properties in various contexts: detection in 
a white noise model (Spokoiny, 1996), x 2 test of uniformity on [0,1] (Ingster, 
2000), goodness-of-fit test and model selection for random variables on the 
real line (Fromont and Laurent, 2006), two-sample homogeneity tests (Bu- 
tucea and Tribouley, 2006), for instance. Note that one would also like to test 
for uniformity by taking into account the uncertainty on the measurements 
of the directional data. In a first approximation, this error can be modeled 
as a convolution noise: the observations are Z% = CiXi,i = 1, ... ,n where 
ei,...,e n is an i.i.d. sequence of random rotations in SO(3). ? addressed 
the problem of testing for the isotropy (X\, . . . ,X n ) in the particular case 
of uniform coverage of the sphere and from noisy observation (random ro- 
tations of the directions). As a consequence of the uniform coverage, their 
adaptive testing procedure is ideally constructed on the multipole moments 
of the observations. 

If one has strong prior information, it is possible to construct tests that 
are not uniform except along a few set of directions, but which can have as 
much power as possible at the n~ 1 ' 2 scale in those few directions of interest. 
This framework is introduced in ? and applied in ? for detecting periodicity 
in a sequence of photon arrival times. We insist on the fact that our setting 
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is different as we do not assume any a priori knowledge on the alternative 
hypothesis. 

In the following paragraphs we discuss the various ingredients of our study. 

3.2.1. Distances. We will consider standard distances of functions on 
the sphere, although there is in fact no clear choice for a 'good' distance in 
this framework : L 1 distance is generally more appropriate for probability 
densities, but LP distances when p is increasing and especially L°° are more 
and more sensitive to bumps. As it is both usual and practical, we will 
mainly consider the L 2 distance (with respect to the invariant measure on the 
sphere). But, we will also consider to express our results other LP distances 
such as LP and L°° . It is important to notice that it is the remarkable ability 
to concentrate of the needlets that enables us to consider various distances. 
More traditional bases would only allow the L? distance and would then be 
much less sensitive to local changes. 

3.2.2. Separation rate. Let T(Xi, . . . , X n ) £ {0, 1} be a non randomized 
decision, i.e. a measurable function of the sample (X±, . . . , X n ) with value 
in {0, 1}. The dependence in n is omitted in most of our notations. As usual 
the event [T = 1] is equivalent to the rejection of the null hypothesis. The 
probability of error of the first kind (false positive) of the decision is denoted 

a n (T) = F g (T = 1) (6) 

while probability of error of the second kind (false negative) against the 
alternative (4) is 

/3 n (T,C)= sup P / (T = 0). (7) 

feT n (d,c) 

Here Ft , ¥ g denote the probability measure under the density / or g for the 
i.i.d sample (Xi, . . . ,X n ). 

Formally the separation rate is defined using the following minimax op- 
timality criterion. A sequence r n is a minimax rate of testing (see Ingster, 
2000) if the following statements are satisfied: 

1. For any r' n such that r' n /r n — > as n — > oo, 

lim inf inf {a n {T) + (3 n (T,l)} = 1 (8) 

n— >oo T 

where the infimum is taken on all decision rules, i.e. {0, 1}- valued 
measurable functions of the sample (X\, . . . , X n ). 
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2. For any a,f3 > 0, there exist some constant C > and a test statistic 
T* (said rate optimal in the minimax sense), such that 

lim sup a n (T*) < a and limsup (3 n (T* , C) < (3 (9) 

Condition (8) says that if the separation rate vanishes faster that r n , then 
no test can do better than the blind random decision, for which the sum of 
the errors of the two kinds is exactly 1. Condition (9) says that there exist a 
decision that is efficient for such a separation rate, so that this rate is indeed 
a critical rate. 

It is clear that a good test become sensitive to closer and closer alternative 
hypothesis (H\ n ) when the sample size n grows. The critical radius gives a 
precise and quantitative for this behavior. The rate r n = \j^/n is the usual 
rate in the regular parametric setting. 

3.2.3. Invariance properties. As the uniform distribution is invariant un- 
der rotations of the sphere, the theory of invariant tests (see Lehmann and 
Romano, 2005, chapter 6) leads to impose the same kind of invariance on 
any statistical procedure for deciding whether / = /o or not (see for instance 
Gine (1975) and the references therein). As bases of invariant subspaces un- 
der rotations, the spherical harmonics are thus the most natural tools to 
detect some deviation from isotropy as in problem (7i). However, as ex- 
plained earlier, a common property of astrophysical observation of (point or 
continuous) processes on the sphere is the non-uniform coverage of the sky 
by the instrument. It is common also that some parts of the data are missing 
or so noisy that it is preferable to completely ignore or mask them. That 
is why non invariant approaches must be considered, and localized analysis 
functions (such as wavelets) may be used as alternatives to spherical har- 
monics. In the same spirit, wavelets has been proposed in the context of the 
angular power spectrum estimation by Baldi et al. (2008) and used in the 
realistic case of a partially observed stationary process with heteroscedastic 
noise in Fay et al. (2008) and Fay and Guilloux (2011) . 

3.2.4. Regularity conditions: Besov spaces on the sphere. Although this 
is not directly the purpose of this paper, it is a natural question to ask 
which kind of regularity spaces our procedures are designed for. The prob- 
lem of choosing appropriate spaces of regularity on the sphere is a serious 
question, and it is important to consider the spaces which generalize usual 
approximation properties. On the other hand, we are interested in spaces 
of functions which can be characterised by their needlet coefficients {Pjk} 
associated to a needlet frame {ipjk} (where j denote the scale and k the 
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position; see Appendix A in the online supplement for the precise defini- 
tions). Hence, as is standard in the nonparametric literature, it is natural 
to consider Besov bodies constructed on the needlet basis. In many situa- 
tion (not only the sphere) it can be proved that these spaces can also be 
described as approximation spaces, so they have a genuine meaning and can 
be compared to Sobolev spaces. We define here the Besov body Bp q as the 
space of functions / = (47r) _1 j s fd/J, + ^2j> Yjk&K. Pj,k4>3,k such that 



E 2JS9 (E(i^m^iw p ) 



q/p 

< oo 



j>o fc&c, 



(with the obvious modifications for the cases p or q = oo). Details on Besov 
spaces and their characterization by wavelets can be found in Triebel (1992) 
and Meyer (1992). For details on the relations between needlets and Besov 
spaces we refer for instance to Narcowich et al. (2006a, b), Petrushev and 
Xu (2008) and Kerkyacharian and Picard (2009). 

4. Needlet based test procedure and other anisotropy tests. We 

introduce here two anisotropy detection procedures based on the needlet 
analysis of {Aj}j = i. iijn . The first one is based on multiple testing and will 
be referred to as Multiple. The second one uses an estimate of the density 
plugged in a distance criterion, and will be referred to as PlugIn. For sake 
of further comparison (see Section 5), we also describe two existing methods 
that are used in the gamma ray burst and cosmic ray literature. The first 
one is based on a nearest neighbour analysis (see Quashnock and Lamb, 
1993; ?; Efron and Petrosian, 1995). The second one relies on the two-point 
correlation (see e.g. Kachelriess and Semikoz, 2006). 

We want detection procedures that are efficient from a L 2 point of view, 
but also for other LP norms. In addition, we will require procedures that 
are simple to implement as well as adaptive to unknown and inhomoge- 
neous smoothness. In Euclidean frameworks, these types of requirements 
are well-known to be efficiently handled by (non-linear) wavelet threshold- 
ing estimation in the context of density estimation (see for instance ?), or 
by multiple tests (Ingster, 2000; Spokoiny, 1996). 

Our problem here requires a special construction adapted to the sphere, 
since usual tensorized wavelets will never reflect the manifold structure of the 
sphere and will necessarily create unwanted artifacts. Recently a tight frame 
(i.e. a redundant family sharing some properties with orthonormal bases), 
called needlet frame, was produced which enjoys enough properties to be 
successfully used for density estimation (Baldi et al., 2009) e.g. concentration 
in the "Fourier" domain as well as in the space domain. Here, obviously the 
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"space" domain is the two-dimensional sphere itself whereas the Fourier 
domain is now obtained by replacing the "Fourier" basis by the basis of 
Spherical Harmonics which leads as mentioned in the previous section to 
invariant tests. This construction produces a family of functions {ipjk>j > 
0, k & fCj} which very much resemble wavelets. The index k defines (with 
an analogy to the standard wavelets) the locations (points) on the sphere 
around which the needlet is concentrated, and j is referred to as the scale. 
These needlets have been shown to be extremely useful for solving several 
type of astrophysical problems (Delabrouille et al., 2009; Fay et al., 2008; 
Pietrobon et al., 2006; Marinucci et al., 2008; Pietrobon et al., 2008; Rudjord 
et al., 2009) or diverse inverse problems in statistics (Kerkyacharian et al., 
2007, 2011, 2010). They are especially well adapted to the situation recurrent 
in astrophysics, where the "full sky" is not covered (meaning in our context 
that there are regions of the sphere where the points X{ are not observed if 
they happen to fall there). 

A formal definition of needlets on the sphere is postponed to Appendix A 
(on-line suppement) and can be found in greater details in Narcowich et al. 
(2006b). For the description of the test procedures, we only need to define 
the empirical needlet coefficients 

1 n 



n 

i=l 



dcf. 



which are unbiased estimators of f3jk(f) =' (/, V'jfc) = J§ /(OV'jA^Od^- As 
usual in the wavelet literature, j > refers to the scale and k to the location. 
The coarsest scale is j = 0. The index k refers to a collections of quadrature 
points {£j,fc} that are available at each scale j. ipj^ is then a zero-mean 
function centered on £jk and more and more concentrated as j — > oo. 

4.1. Multiple tests. For multiple tests, we will consider collections of "lin- 
ear estimators" of the density, meaning that we won't use any non-linear pro- 
cessing of wavelet coefficient such as thresholding in the estimation phase. 
By analogy with the work of Butucea and Tribouley (2006) on the related 
problem of the two-sample nonpar ametric homogeneity test, we define 

3=0 k£K, 
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with the f3jk's given by (10). For any value of the smoothing parameter J, 
we define the non-randomized associated testing procedure 



1 if d(fj,g) > tj 
if d(fj,g) < tj 



r ' =1 «/,*)>*,H« .- .;v ,y : -. (12) 



This gives a family of tests indexed by J, where the dependence with respect 
to the choice of the distance d and to the sequence of thresholds tj is made 
implicit in the notation. 

Butucea and Tribouley (2006) proved that if the regularity conditions are 
known and specified by Besov conditions, the smoothing parameter J can 
be chosen optimally. It is likely that their arguments could be reproduced in 
our case. However our point of view in this paper will not be to detail this 
theoretical issue but rather to concentrate on the practical aspects of the 
tests. Moreover, it would be probably difficult to relate physical information 
to mathematical regularity conditions. 

Nevertheless, the optimal choice for the parameter J depends on the reg- 
ularity s specified in the class of alternatives. Adaptive optimality can be 
achieved thanks to a multiple test that decides for the alternative hypoth- 
esis as soon as one of the Tj(d,cj) = 1 individually does so, i.e. defining 
^Multiple u v 



-iMULTIPLE 



if and only if VJ < J\ Tj = 0. (13) 



Mimicking the theoretical results obtained in Butucea and Tribouley (2006) 
and Baldi et al. (2009) we have used 

J*=L|log 2 (n/logn)J (14) 

as reference in our numerical investigations, as in the case of adaptive den- 
sity estimation (see below). Note however that the optimal J* could vary 
according to the loss function {IP norm) we use to measure the non isotropy 
as suggested by the results of the related problem in the two sample nonho- 
mogeneity detection Butucea and Tribouley (2006). The values tj that are 
used in (12) must be chosen to verify f> (2"' MuLTIPLE = q) ~ a w h ere a is the 
prescribed level of the test. 

4.2. Plug-in tests. It is also interesting to compare, from an empirical 
point of view, the above multiple test procedures to algorithms where we 
simply plug-in an adaptive estimate of the density in the distance. These 
density estimators have good asymptotic properties from a minimax point 
of view, hence it makes sense to investigate also their properties when used 
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for testing. To the best of our knowledge no theoretical optimality is proved 
and there even are arguments suggesting that these procedures might not 
be optimal. For instance, on the real line, the minimax rate of convergence 
for estimation (in the so-called dense case) is n~ s < ( 2s+1 ) ) meaning that if / 
belongs to a ball in a Holder space with exponent s, then no estimator can 
approach the least favorable density at a better error rate (measured in a 
LP norm). We refer to ?, Theorem 3 for a precise statement, among others. 
On the other hand, the minimax critical radius for non-uniformity detection 
is n - 2s /( 4s + 1 ) (gee Ingster, 2000). It means that, in the minimax framework, 
one can distinguish asymptotically two hypotheses that are separated by 
a distance negligible with respect to the accuracy of any nonparametric 
estimation of the densities in a infinite dimensional space. 

The most popular minimax adaptive technique consist in adding to a very 
basic linear estimation a thresholding rule as post-processing. In the above 
mentioned paper (Baldi et al., 2009) this nonlinear post-processing actually 
is a hard thresholding rule, namely 

1 J * 

3=0 k&Kj 

for some positive constants k and J* = [3 l°g2( n /l°g n )J • The coefficients 
f3jk are defined in (10). 

It is known that many variations exist with close theoretical properties 
but some differences in different practical situations. Among those, we will 
especially consider the data-driven thresholding introduced by Juditsky and 
Lambert-Lacroix (2004) to deal with density estimation on the real line (as 
opposed to density on [0,1]). It seems to give good detection procedures for 
small samples in our context. In the following, we will consider the non-linear 
estimates 

j* 
1 J- 

fo* = fa + Z> Z> 1 0, k \>XV^na jk 1 ^k>plogn$jk1pjk (15) 

3=1 k&Kj 

for some positive constants p, A, J* = [3 log 2 (n//9logn)J , and where 

def. 1 



°3k 



n 

i=\ 



^f fe PQ)-(/3 ife ) 2 , (16) 

i= 

s 3 k d = (^(^r'E^pQ- (17) 



j=i 
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Let us give a short interpretation of the thresholding procedure. The 
quantity a 2 - k is an estimate of the variance of jijk- The expression for 5jk 
is inspired by the one provided in Juditsky and Lambert-Lacroix (2004). 
In this reference, compactly supported wavelets on the real line are used 
with a threshold on the number of observations actually participating to 
the estimation of f3jk- In this case, it makes sense to count the number of 
observations falling in the support of the wavelet. In our case, as needlets 
are supported on the whole sphere (although very concentrated) , we propose 
to replace this quantity by a continuous type approximation 5jk, see (17). 
Note that 6jk = n if X± = ■ ■ ■ = X n = £jk. 

Finally, we define the PlugIn procedure as the decision 

rpPLUGlN -i /io\ 

1 J ~ i rf(/ J *,s)>t p / LUGlN { ' 

with fj* defined in (15) and £P LUGlN some fixed threshold depending on the 
prescribed level a of the test. 

4.3. Two-point correlation test and nearest neighbour test. When dealing 
with one dimensional data, one can compare every test procedure to the well 
known benchmark Kolmogorov-Smirnov or Cramer-von Mises tests, which 
are based on the empirical distribution function of the sample. In higher 
dimensions (here on the sphere), there is no natural order relation that 
allows to consider such approaches. For sake of comparison, we have run 
some simulations on two different tests found in the astronomical literature. 

Nearest neighbour test. The following statistical procedure has been pro- 
posed by Quashnock & Lamb (1993). We denote it NN, as nearest neighbour. 
For each point Xi, we compute the distance Yi to its nearest neighbour. Un- 
der the hypothesis that / is uniform over the whole sphere, the marginal 
distribution function of (Yi) is <f> : y i->- 1 — [(1 -f cosy)/2] n ~ 1 , and the 
Wilcoxon statistic 

is asymptotically standard Gaussian. For nonhomogeneous random draw 
(for instance, in presence of clusters), this statistic is expected to take sig- 
nificantly high values, allowing to detect this kind of anisotropy. This test 
is of interest as it is simple to compute, it has no parameters to be tuned, 
and that it admits an extension to non uniform sky coverage (see Efron and 
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Petrosian, 1995). In this case, the distribution of W is estimated numerically 
by Monte Carlo methods. The NN procedure simply writes 

T NN = l w > tNN . (19) 

where tf^ a is the (1 — a)-quantile of the distribution of W . This distribution 
can be approximated by a standard Gaussian distribution if the sample size 
is big and the coverage is uniform. Otherwise, the quantile is estimated by 
Monte Carlo method. 

Two-point correlation test. Among others, ?Kachelriess and Semikoz (2006) 
use the empirical two-point auto-correlation function to detect clustering 
(TwoPC test). For a collection of n points {X{\ and any angular distance 
5 G [0, 7r], let N n (5) denote the random number of pairs {i,j} such that 
A(Xi,Xj) < 5, where A is the geodesic distance. Define the two-point cor- 
relation function w n (6) = K(N n (S)) and its empirical counterpart 

w n (5) = Y,l{o,s ] (MX t ,X J )). (20) 

i<j 

Under the null hypothesis, the distribution of w n at any 5q is evaluated 
using Monte Carlo simulations. Then, the detection will be based on the 
comparison between the empirical correlation function and w n , at some fixed 
value $o or a few different values. A typical 5q can be chosen so as to maximize 
the sensitivity of the test depending on the application. In some references 
however the probability to observe a value bigger that w n {5) is plotted on 
the whole range [0, it] with no 5q fixed a priori. Consequently, much care be 
taken when interpreting those values, as stressed for instance in Kachelriess 
and Semikoz (2006). Here we define the procedure TwoPC by the decision 

rpTWOPC -1 /oi \ 

1 = 1 u. n (<5„)>t T «oPC. (21) 

where t]\^° is the (1 — a) quantile of the distribution of w n (5o) under the 
null, evaluated by Monte Carlo simulations, at some <5o specified a priori. 

5. Monte Carlo experiments. 

5.1. Experimental setup. In this Section we compare numerically the 
tests defined in Section 4 that are denoted Multiple, PlugIn, NN, and 
TwoPC. 

For T being any of those non randomized test procedures, we can tune the 
parameters of the procedure to have a prescribed level a, i.e. F g (T = 1) = a. 
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This is done by Monte Carlo replication. Ten thousand independent random 
samples of size n are drawn under the null hypothesis, for g being the uniform 
density on S ( i.e. g = 1/(4-71"))) or the stylized coverage function of the Pierre 
Auger detector (see Figure 1). 

For the Multiple procedure and a given level a, we have chosen 

Tj = I,,? ,, ^ (22) 

where t a ij is the I — a' quantile of the distribution of \\fj —g\\ p under the null 
hypothesis. This distribution is evaluated using Monte Carlo replications. 
Further, the value a 1 is chosen so that 

T'j*= sup T 3 (23) 

has a first type error probability equal to a. This is arbitrary and the theory 
to be written would likely suggest to use a scale dependent level. 

The power of the test T is defined by (7). Some clues about this value 
are obtained by evaluating ¥f(T = 1) for particular alternatives / that are 
given in the next Section. Here again, those quantities are evaluated by 
Monte Carlo. Note however that the power for a particular alternative only 
gives an upper bound of the power in the minimax sense given by the second 
equation of (7). 

In the following tables of tests we represent the power of the needlet tests 
as a function of the finest band J* and the power of the norm we use to detect 
anisotropy (see Appendix A for more details on the actual implementation 
of the method). 

The profile cuts of the (axisymmetric) needlets we have used are plotted 
the online supplement. 

5.2. Alternatives. We have investigated the performance of the test (power 
against level) for sample sets of small to moderate size (n = 25, 100, 400) 
and against different alternatives. Those choices of n mimics the progres- 
sive publication of events by the Pierre Auger Observatory (27 events above 
5.7 x 10 19 eV in 2008, 69 above 5.5 x 10 19 in 2010, a few hundreds in the 
future) . 

Generally speaking, the physical plausibility of those alternatives is weak 
(alternative (Hf)), if not null (alternatives (H^) and (Hf)). Our goal is to 
focus here on specific departures from isotropy. First we consider unimodal 
non isotropic densities, with a Gaussian shape. Then we consider mixtures 
of densities that would only be obtained if the sources of the cosmic rays 
were known to be uniformly distributed and repeating, and at the same 
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11.62 0.07 




Fig 3: Densities (first line) and random draws (second line, n = 100) under 
10% and 9 = 5° (left) or 9 = 20° (right). 



(JTf) with 5 



distance from us. Third, the Physics-inspired model (Hf) give rise to non- 
isotropic patterns with richer frequency content compared to the previous 
ones (and non axisymmetric clusters). We now give the precise definitions 
of the alternatives. 

(Hf). The first family of alternatives is obtained as a mixture of the uni- 
form density /o and an over-density at some point of the sphere, with 
Gaussian-like axisymmetric profile. Precisely, the density under (Hf ) writes 

/(£) = 0—S)fo + She® 

where MO := fy&CO, h<,fy := C e exp(-(£ • £ o ) 2 /20 2 ) and & = (*r/2,0). 
Such densities are then unimodal, with a bump whose width is proportional 
to 9. Typical observations of random draw with such density with 5 = 0.01 
and 9 = 5° or 20° are displayed on Figure 3. 

(Hi). A second family of alternative is a toy model for the repeating emis- 
sion of events from a small number of sources, as explained in the Introduc- 
tion. Here we assume that the n s sources are uniformly distributed, although 
in a realistic case, we can expect any type of astrophysical sources to follow 
the local matter density of the cosmic structure (which would make the de- 
tection of anisotropy easier). This generalisation is straightforward enough 
that we do not discuss it further at this stage. Conditionally to those posi- 
tions, the Xi are distributed along a mixture of n s Gaussian densities centred 
on the sources (to take into account the error in the measurement of the in- 
cidence angle or the deflection of the charged particle by Galactic magnetic 
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Fig 4: Density of X\ conditionally to the random draw of the centers of 100 
AGNs for a uniform coverage (first line) and random draws with n = 400 
(second line). The coverage is uniform on the left, a la South Pierre- Auger 
observatory on the right. 



fields) multiplied by the coverage function (g) of the instrument. Namely 



/(O = 0(OJl>,&(O 



Such conditional densities are displayed on the first line of Figure 4. We 
considered the cases n s = 10 and n s = 100 and fixed 9 = 10°. Note that 
if n s is much bigger that n, it is difficult to detect this kind of anisotropy 
(which can be detected only if at least one source has emitted more than 
one cosmic ray). 

(Hf). A third and last alternative is obtained by the physical model of cos- 
mic ray observations described in detail in Section 2. Sources are randomly 
drawn in a spherical volume of radius r max = 70 Mpc, and their flux is as- 
sumed inversely proportional to the square of their distance. The parameters 
for the simulations are taken to be E milx = 10 21 eV, a = 4.2. We consider dif- 
ferent values for .Emm (namely 1, 4 or 6 xl0 19 eV). Playing on this parameter 
has an important practical incidence. Assuming that the distribution of the 
energy of the cosmic rays is a power law, ¥(E > t) ~ Ct~ a+1 , lowering the 
threshold on the selection of the cosmic rays from 6 x 10 19 eV to 4 x 10 19 eV 
(resp 10 19 eV) accounts to increase the size of the sample (available observa- 
tions above the threshold) by a factor (6/4) Q_1 ~ 3.66 (resp 310). It means 
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(a) £ min = 10 19 eV 



(b) Emin = 4 x 10 19 eV (c) E mia = 6 x 10 19 eV 



Fig 5: Isotropization of the cosmic rays in model (Hf) as £ m ; n decreases. 
There are the exact same n s = 30 sources in the three cases and n = 1000 
observations. 



that the statistical decision should be made far easier if the cosmic rays were 
not too much isotropized by the Galactic fields as their energies goes lower. 
This effect is illustrated in Figure 5. It is interesting to see if the methods 
are still able to detect anisotropy as the cosmic rays become more and more 
isotropized. This is a more realistic simulation compared to models (Hf) and 
(Hi). There is no single size for the scatter of the CRs coming for a given 
source, nor the same size or directionality for each source, nor the same flux 
for each source, that hence is interesting specifically for a multiscale analysis 
with no prior assumption about a correlation length. 

Note that under the alternatives (H\) and (Hi), the procedure is to be 
understood as a test on the conditional distribution of (-Xj)i = i r .. )n with re- 
spect to the positions of the "sources" , which are randomly drawn once for 
all. 

5.3. Numerical results and discussion. 

Tables. We shall represent some of the results of our simulations with ta- 
bles of estimated power of the procedures for given alternatives (in percent), 
at the prescribed level a = 0.05. Practically, we let the finest needlet band 
entering the Multiple and PlugIn procedures vary in the set { J* — 2, J* — 
1, J*, J* + 1} where J* is given by (14). The entry (or entries) correspond- 
ing to the overall highest power (before rounding off) among the 26 values 
is (are) printed in bold type. We consider three L p norms, namely LP for 
p = 1,2, oo. It is possible to use an unbiased estimate of the distance be- 
tween / and g in the case of the L 2 norm. It is refered to as p = 2* (see 
Appendix A. 2. 3 in the on-line supplement for details) 

ROC curves. The receiver operating characteristic (ROC) curves plot the 
power p of a procedure as a function of its level a. It is a useful representation 
for comparison of different procedures along a wide range of levels. The ROC 



imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 25 

curves associated to the TwoPC procedure are step function because of 
the discrete nature of the test statistic. Some of the ROC curves are non 
concave. It should be recalled however that any procedure of this kind can be 
improved to a randomized procedure whose ROC curve is the concave upper 
envelope of the original one. Accordingly, the reader's eyes must actually 
analyse the upper envelopes of the ROC curves. Note that the power in the 
tables have not been modified by this argument. 

ROC curves are represented in plots with four subplot, corresponding to 
the four above-mentioned choices of J* in the needlet methods. The ROC 
curves for TwoPC and NN procedures are the same in the four subplots. 
Inset graphs allow complementary comparison of the methods by zooming 
on the most relevant levels (small a). 

5.3.1. Some specific results. Figure 7 illustrates that the methods Mul- 
tiple and PlugIn have a consistent behaviour when the typical radius of 
anisotropic structure is varying. The model we consider here is {Hf) with 
9 = 5° or 9 = 20°. We shall discuss further from those cases below. Figure 8 
illustrate the good performances even for small samples under the model 
(Hf ) that produces clusters of various sizes and shapes. 

The differences of sensitivity between the different norms are not very 
strong, probably because we consider quite regular alternative hypotheses. 
As expected, the L°° is a bit more sensitive to more spiky (unimodal) dis- 
tributions, whereas more global measures such as L 1 or L 2 perform better 
under the {H\) or (H^) models. This is illustrated by the ROC curves of 
Figure 13 in the online supplement. 

Under the alternative (H\) the NN procedure performs strikingly well, 
as illustrated by the power of the four procedures in Table 5. Recall that 
there is no parameter to tune in this method. Those good results can be 
explained in the following manner. Under {H\), the points {-Xj}j=i,...,n are 
mainly grouped into clusters of average scale given by the standard deviation 
of the Gaussians of the mixture. If the number of clusters and this standard 
deviation are too small to cover significantly the whole sphere, then the 
random distances to the nearest neighbour are bounded by a with very high 
probability, which is not the case under the null. This makes the distribution 
of the distance to the nearest neighbour a very sensitive tool to discriminate 
between (H^) and the null. 

Varying the alternatives, it appears that no method outperforms the other 
in a uniform way, but it seems that the two needlet methods, if not always 
optimal, consistently have a good behaviour. Moreover, the Multiple test is 
slightly more sensitive that the PlugIn one. As an illustration, we represent 
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in Tables 3 to 8 the results of the simulation for a quite representative 
panel of alternatives, containing more or less spiky distributions, clusters of 
smoother alternatives, weak or strong anisotropy etc. 

5.3.2. Sample size. It is important to notice that the TwoPC approach 
often provides a good sensitivity if not the best at n = 25. For most of the 
alternative, however, one or the other of the needlets methods outperforms 
TwoPC as n grows. This is exemplified in Tables 3 to 8. For n = 100, the 
needlet methods are more sensitive, whereas for n = 25 they are usually less 
sensitive than TwoPC. As already noticed, NN is consistently better for 
the model (H\) (see Table5). 

In our application context, the sample size over a given energy threshold 
is increasing with time and experiments, so it must be highlighted that 
multiscale methods are more and more appropriate for analysis of future 
datasets. 

5.3.3. Separation rate. We focus here on the behavior of the power of 
the test with respect to n. If r n is the critical rate in the minimax sense 
(given by Equations 8 and 9), we should observe an approximately same 
power for different sample size and the least favorable alternative densities 
f n as soon as the quantity r n d(f n ,g) remains constant. On Table 9, we have 
displayed the power of the different procedures for three different densities 
corresponding to the alternative (Hf) and three sample sizes, keepin the 
same value for n l > 2 d(f,g). Indeed, in the (Hf) case, for any power norm, 
d(f, /o) = Sd(fo, ho). As the power remains roughly in (0, 1) the same for the 
three values of the parameters, and as n 1 ' 2 is an upper bound four the min- 
imax separation rate in analogy with similar problems on Euclidean spaces, 
this numerical simulations is consistent with the claim that the needlet based 
procedures perform well at the minimax rate of testing. The increasing value 
of the power with n together with the unbeatable rate of separation \fn illus- 
trates the fact that we only have access to upper bounds of the minimax rate. 
In other words, the densities under consideration are definitly not the least 
favorable cases. The comparison of needlet methods with NN and TwoPC 
methods tends to be more favourable to needlets methods as n becomes 
larger in this case. 

5.3.4. Robustness. Assume that the anisotropy detection by the needlet 
methods is adaptive. Then, as pre-tuned black boxes, those methods should 
remain optimal on a wide range of alternatives. Some simulations support 
this claim. Note however that we only explore physically possible alternative 
which are smooth non uniform densities. 
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The key parameter of the TwoPC method is the angular size 5o at which 
we compare w(6) to the distribution of w(5q) under the null. For sake of 
fairness in our comparisons, we should allow some tuning of this parame- 
ter. It is clear that the optimal <5o is related to the "average scale" of the 
anisotropy. Though it is difficult to give a precise and general definition of 
this former quantity, it should be close to the value of the parameter 9 in the 
particular case of model (Hf). Figure 7 has been obtained for two densities 
of this model, with 5q = 3°. It appears that TwoPC is better than the 
needlet methods when 9 = 5° and worse when 9 = 20°. 

On Figure 9 we have plotted the estimated power of the tests against 
different alternative (Hf), with different parameters, and for different pa- 
rameters for the methods. The figure shows that the optimal do is indeed 
related to the parameters 9 of the alternative. However, when dealing with 
alternative that give rise to structures at different "scales", the optimal 
choice of 5o is not clear. This is illustrated by Figure 10, where models {Hf) 
are under consideration. By observing the large variations of the power of 
the TwoPC procedure with respect to 5o in both cases, one can conclude 
that this procedure should incorporate a data-driven selection of 5o to be 
truly efficient. 

The situation is strikingly different for the needlet methods. One can ob- 
serve from the left columns of Figures 9 and 10 that the power reaches some 
plateau after J* > J m i n in a very consistent way across the different alterna- 
tives. This robustness is a strong point of those methods. The dependence 
in n is quite weak too. For instance, taking J* = 4 leads to a small loss of 
efficiency uniformly with respect to the best choice for each given situation 
of sample size and model. 

6. Analysis of Auger data. We have run the previous tests on the 
Auger public data made available by (The Pierre AUGER Collaboration 
et al., 2010). It is composed of 69 arrival directions of cosmic rays with 
energy above 55 EeV and detected by the Pierre Auger Observatory between 
1 January 2004 and 31 December 2009. Those directional events are plotted 
on Figure 6. The distributions of the tests under study for n = 69 and under 
the null hypothesis has been evaluated by Monte-Carlo simulation of length 
10.000. 

Along with the detection of a correlation between cosmic rays directions 
and catalogues of potential sources, the Pierre Auger collaboration already 
performed a catalogue-free test for anisotropy with no reference to any cat- 
alogue, using the TwoPC procedure. As noticed earlier, the critical value 
for this method is the choice of 5o in (21). The p-value of this test for the 
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Fig 6: The 69 arrival directions of cosmic rays with energy above 55 EeV 
and detected by the Pierre Auger Observatory up to 31 December 2009 (The 
Pierre AUGER Collaboration et al, 2010) 



69 UHECRs data set reaches a minimal value of 

p-value(TwoPC) ~ 0.008 

around So ~ 10.7°. Recall that in order to be interpretable as a classical p- 
value for a single hypothesis testing, this p-value should have be computed 
from an out-of-the-sample prescription of 5q, which is not the case here. Then 
this p-value strongly exaggerates the significance of the detection. Indeed, 
as already noticed in (The Pierre AUGER Collaboration et al., 2010), we 
computed that the fraction of isotropic simulations that are as non-isotropic 
as the real data at some angle between 4° and 14° is as high as 10%. We 
have also computed a that 

p-value(NN) ~ 0.07 . 

The p-values of Table 1 are the p-values computed from the Pierre Auger 
data set for our Multiple and PlugIn procedure. 

For the Multiple test, the p- value is defined as the proportion of draws 
(under the null) that have a higher single test statistic in at least one value 
of j 6 {1, . . . , J*}. The resulting p- value is quite sensitive to the choice of 
the highest band J*, except if one uses the L 2 -norm. Note that if we take the 
L 2 norm and the theoretical J* = 2 given by the expression (14), the results 
for the Multiple test are not statistically significant. But the Monte-Carlo 
simulations suggest that this theoretical choice of J* is not optimal for small 
to medium sample size, being to small. 

The PlugIn is more stable and consistently considers that the Auger 
data is significantly non isotropic. The almost constant p- values in this case 

imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 



29 



J* 


p=l 


p = 2* 


p — oo 


1 


0.957 


0.788 


0.387 


2 


0.051 


0.112 


0.035 


3 


0.118 


0.050 


0.004 


4 


0.434 


0.046 


0.003 


5 


0.227 


0.095 


0.624 


6 


0.762 


0.045 


0.341 



J* 


p=l 


p = 2* 


p — oo 


1 


0.956 


0.958 


0.397 


2 


0.033 


0.037 


0.036 


3 


0.017 


0.008 


0.005 


4 


0.017 


0.008 


0.008 


5 


0.017 


0.008 


0.008 


6 


0.017 


0.008 


0.008 



(a) Multiple test (b) PlugIn test 

Table 1 

P-values of the Multiple and the PlugIn tests for Auger data (n = 69,). 



is the consequence of a hard thresholding rule in (15) that cancels all the 
estimated coefficients fijk as soon as j > 3 for this data set. This may in 
turn give a rule-of-thumb rule to define a data-driven J* for the multiple 
test. 

To conclude on this important data set and this methodology, it appears 
that the needlet methods find a stronger statitstical evidence of some kind 
of anisotropy in the Pierre Auger data. More realistic alternatives and more 
simulations can help to choose the J* parameter of the Multiple procedure, 
and additional parameters of the PlugIn approach. 

7. Conclusion. In this paper, we have investigated the problem of the 
detection of anisotropy of directional data on the unit sphere, with an ap- 
plication to the analysis of ultra high energy cosmic ray events as observed 
with a detector such as the Pierre Auger observatory. It was important to 
consider samples whose sizes are comparable to the sizes of the data sets that 
are available nowadays for cosmic rays scientists (about 25 at the beginning 
of this work, about a hundred now). Although we are mainly interested 
in small sample performances, we have proposed a multiple test approach 
based on a multiresolution analysis of the data, which could hopefully be 
proved to be asymptotically optimal in the minimax sense, a well-known 
pessimistic framework. 

We have proposed, and tested on various simulated data sets, two meth- 
ods using the decomposition of the directional data onto a frame of spherical 
needlets. Their performance has been compared to other (more specific) ap- 
proaches based on the nearest neighbour and on the two point correlation 
function. The simulation shows that the needlet-based methods perform 
comparatively very well in various situations. They are competitive with 
existing method at small sample size, and tend to outperforms them from 
moderate sample size. Moreover, the "omnibus" property of the needlets 
method is interesting for the problem at hand, in which the type of pos- 
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sible anisotropy (the class of alternative) is not really well known a priori. 
In addition, a multiple test based on the use of spherical needlets offers a 
good opportunity to extend the method of detection of anisotropies with 
not only multiplicity in the scales tested, but also in ranges of energy of 
the incoming particles. Indeed, while in this work we have used the energy 
level as a simple threshold, one could instead implement a detection using 
the joint directional-energy information - allowing thus to simultaneously 
extract information from the highest energy cosmic rays, which are not de- 
flected much by Galactic and extragalactic magnetic fields, and also from 
lower energy events, more deflected but much more numerous. In the light 
of our simulations on an energy level-dependent model, multiscale approach 
could lead to stronger conclusion using the CR data that are not yet made 
public by the Pierre Auger Observatory. 

As in any nonparametric method, there is at least one parameter to be 
tuned, often by hand or using more sophisticated data-driven methods such 
as cross-validation. In the needlet methods one can tune several parameters 
(shape of the needlets, highest scale J* — although there's is an asymptotic 
formula for it — , thresholds on the coefficients in the PlugIn approaches, 
thresholds on the individual tests in the Multiple procedure, power norm). 
It is plausible however that a large range of possible choices for most of these 
parameters give comparable performance. 

Although we have used needlets that are compactly supported windows in 
the harmonic space, it may be arguable that they are not the most appropri- 
ate tool. One could consider, as an alternative, better spatially concentrated 
functions (such as the Mexican needlets, see e.g. Lan and Marinucci, 2009) 
or, in general, try to optimize the needlet window function given prior knowl- 
edge of the physical problem and of the expected properties of anisotropic 
distributions of the cosmic ray direction of incidence. In this spirit, it would 
be interesting to consider directional wavelet such as curvelets or ridgelets 
(see Starck et al., 2006) to test for specific strip-like alternative densities. 
It is also possible to consider non-dyadic needlets. The choice of B £ (1,2) 
allows a finer coverage of the frequency line. The numerical results presented 
here have not taken this benefit into full account, and whether significantly 
higher power can be obtained by optimizing this number remains to be in- 
vestigated. 

Finally, in addition to the aforementioned possible extensions of our meth- 
ods, we want to stress that the work presented here also opens the way to 
two lines of future investigations, one on the applications side, and one more 
theoretical. On the experimental side, it will be of much interest to apply the 
method on larger data sets (for instance by lowering the energy threshold 
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to increase the available sample size). On the theoretical side, the validation 
of the approach has to be investigated on the basis of some theory in the 
minimax framework it is designed for. 

Acknowledgment. For the numerical part of this work, we acknowledge 
the use of the HEALPix package (Gorski et al., 2005) and of the SphereLab 
(an Octave interface to HEALPix package, needlet transforms, and utilities). 
We thank Dmitri Semikoz for useful discussions concerning high energy cos- 
mic rays, and Maude le Jeune, Jean-Francois Cardoso and Frederic Guilloux 
for making available some of the tools used for the computational aspects of 
the present work. We thank the anonymous referees and associate editor for 
their suggestions and remarks that improved the presentation of this paper. 

References. 

J. Abraham et al. Observation of the Suppression of the Flux of Cosmic Rays above 
4xl0 19 eV. Physical Review Letters, 101(6):061101-+, August 2008. . 

P. Baldi, G. Kerkyacharian, D. Marinucci, and D. Picard. Asymptotics for spherical 
needlets. On arXiv.org : math.ST/0606599, 2006. 

P. Baldi, G. Kerkyacharian, D. Marinucci, and D. Picard. Subsampling Needlet Coeffi- 
cients on the Sphere. ArXiv e-prints, 706, June 2007. 

P. Baldi, G. Kerkyacharian, D. Marinucci, and D. Picard. Asymptotics for spherical 
needlets. Ann. Statist., 2008. (In press). 

P. Baldi, G. Kerkyacharian, D. Marinucci, and D. Picard. Adaptive density estimation for 
directional data using needlets. Ann. Statist, 37(6A):3362-3395, 2009. ISSN 0090-5364. 

Pijushpani Bhattacharjee and Guenter Sigl. Origin and propagation of extremely high 
energy cosmic rays. Physics Reports, 327:109-247, March 2000. . 

Cristina Butucea and Karine Tribouley. Nonparametric homogeneity tests. J. Statist. 
Plann. Inference, 136(3):597-639, 2006. ISSN 0378-3758. 

Ingrid Daubechies. Ten lectures on wavelets, volume 61 of CBMS-NSF Regional Confer- 
ence Series in Applied Mathematics. Society for Industrial and Applied Mathematics 
(SIAM), Philadelphia, PA, 1992. 

J. Delabrouille, J.-F. Cardoso, M. Le Jeune, M. Betoule, G. Fay, and F. Guilloux. A full 
sky, low foreground, high resolution CMB map from WMAP. Astron. Astrophys., 493: 
835-857, January 2009. . 

David L. Donoho, Iain M. Johnstone, Gerard Kerkyacharian, and Dominique Picard. Den- 
sity estimation by wavelet thresholding. Ann. Statist., 24(2):508-539, 1996. ISSN 0090- 
5364. . URL http://dx.doi.org/10.1214/aos/1032894451. 

B. Efron and V. Petrosian. Testing Isotropy versus Clustering of Gamma-Ray Bursts. 
Astrophys. J., 449:216-+, August 1995. . 

G. Fay and F. Guilloux. Consistency of a needlet spectral estimator on the sphere. Sta- 
tistical Inference for Stochastic Processes, 14:47-71, 2011. 

G. Fay, F. Guilloux, M. Betoule, J.-F. Cardoso, J. Delabrouille, and M. Le Jeune. CMB 
power spectrum estimation using wavelets. Physical Review D (Particles, Fields, Grav- 
itation, and Cosmology), 78(8):083013, 2008. . 

Magalie Fromont and Beatrice Laurent. Adaptive goodness-of-fit tests in a density model. 
Ann. Statist, 34(2):680-720, 2006. ISSN 0090-5364. 

imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



32 G. FAY ET AL. 

Evarist Gine. Invariant tests for uniformity on compact Riemannian manifolds based on 
Sobolev norms. Ann. Statist, 3(6):1243-1266, 1975. ISSN 0090-5364. 

K. Gorski, E. Hivon, A. Banday, B. Wandelt, F. Hansen, M. Reinecke, and M. Bartelmann. 
HEALPix: A Framework for High-Resolution Discretization and Fast Analysis of Data 
Distributed on the Sphere. Astrophys. J., 622:759-771, April 2005. . Package available 
at http://healpix.jpl.nasa.gov. 

K. Greisen. End to the Cosmic-Ray Spectrum? Physical Review Letters, 16:748-750, April 
1966. . 

F. Guilloux, G. Fay, and J.-F. Cardoso. Practical wavelet design on the sphere. Appl. 
Comput. Harmon. Anal, 26(2): 143-160, march 2009. 

D. Harari, S. Mollerach, and E. Roulet. Astrophysical magnetic field reconstruction and 
spectroscopy with ultra high energy cosmic rays. Journal of High Energy Physics, 7: 
6-+, July 2002. . 

A. M. Hillas. The Origin of Ultra-High-Energy Cosmic Rays. Annual Review of Astronomy 
and Astrophysics, 22:425-444, 1984. . 

Yu. I. Ingster. Asymptotically minimax hypothesis testing for nonparametric alternatives. 
I, II, III. Math. Methods Statist., 2(2):85-114, 171-189, 249-268, 1993. 

Yu. I. Ingster. Adaptive chi-square tests. J. Math. Sciences, 99:1110-1120, 2000. 

Yu. I. Ingster and I. A. Suslina. Nonparametric goodness- of -fit testing under Gaussian 
models, volume 169 of Lecture Notes in Statistics. Springer- Verlag, New York, 2003. 
ISBN 0-387-95531-3. 

Anatoli Juditsky and Sophie Lambert-Lacroix. On minimax density estimation on R. 
Bernoulli, 10(2): 187-220, 2004. ISSN 1350-7265. 

Michael Kachelriess and D. V. Semikoz. Clustering of ultra-high energy cosmic ray arrival 
directions on medium scales. Astropart. Phys., 26:10-15, 2006. . 

G Kerkyacharian and D. Picard. New generation wavelets associated with statistical 
problems. In The 8th Workshop on Stochastic Numerics, pages 119-146. Research 
Institute for Mathematical Sciences, Kyoto University, Jan. 2009. 

Gerard Kerkyacharian, Pencho Petrushev, Dominique Picard, and Thomas Wilier. Needlet 
algorithms for estimation in inverse problems. Electron. J. Stat., 1:30-76 (electronic), 
2007. ISSN 1935-7524. 

Gerard Kerkyacharian, George Kyriazis, Erwan Le Pennec, Pencho Petrushev, and Do- 
minique Picard. Inversion of noisy Radon transform by SVD based needlets. Appl. 
Comput. Harmon. Anal, 28(l):24-45, 2010. ISSN 1063-5203. . 

Gerard Kerkyacharian, Thanh Mai Pham Ngoc, and Dominique Picard. Localized decon- 
volution on the sphere. The Annals of Statistics, 39(2):1042-1068, 2011. 

Xiaohong Lan and Domenico Marinucci. On the dependence structure of wavelet coeffi- 
cients for spherical random fields. Stochastic Process. Appl, 119(10):3749-3766, 2009. 
ISSN 0304-4149. . URL http://dx.doi.Org/10.1016/j.spa.2009.07.005. 

E. L. Lehmann and Joseph P. Romano. Testing statistical hypotheses. Springer Texts in 
Statistics. Springer, New York, third edition, 2005. ISBN 0-387-98864-5. 

D. Marinucci, D. Pietrobon, A. Balbi, P. Baldi, P. Cabella, G. Kerkyacharian, P. Natoli, 
D. Picard, and N. Vittorio. Spherical Needlets for CMB Data Analysis. M.N.R.A.S., 
383(2) :539-545, January 2008. 

V. J. Martinez and E. Saar. Statistics of the Galaxy Distribution. Chapman & Hall/CRC, 
2002. 

Yves Meyer. Wavelets and operators, volume 37 of Cambridge Studies in Advanced Math- 
ematics. Cambridge University Press, Cambridge, 1992. ISBN 0-521-42000-8; 0-521- 
45869-2. Translated from the 1990 French original by D. H. Salinger. 

F. Narcowich, P. Petrushev, and J. Ward. Decomposition of Besov and Triebel-Lizorkin 

imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



TESTING THE ISOTROPY OF HIGH ENERGY COSMIC RAYS 33 

spaces on the sphere. J. Fund. Anal., 238:530-564, September 2006a. 

F. J. Narcowich, P. Petrushev, and J. D. Ward. Localized tight frames on spheres. SIAM 
J. Math. Anal, 38(2):574-594, 2006b. 

P. Petrushev and Y. Xu. Localized polynomials frames on the ball. Constr. Approx., 27: 

121-148, 2008. 
D. Pietrobon, A. Amblard, A. Balbi, P. Cabella, A. Cooray, and D. Marinucci. Needlet 

detection of features in the WMAP CMB sky and the impact on anisotropics and 

hemispherical asymmetries. Physical Review D, 78(10):103504 — h, November 2008. . 
Davide Pietrobon, Amedeo Balbi, and Domenico Marinucci. Integrated Sachs-Wolfe effect 

from the cross correlation of WMAP3 year and the NRAO VLA sky survey data: New 

results and constraints on dark energy. Phys.Rev. D, 74, 2006. 
J. M. Quashnock and D. Q. Lamb. Evidence for the Galactic origin of gamma-ray bursts. 

M.N.R.A.S., 265T45-L50, December 1993. 
0. Rudjord, F. K. Hansen, X. Lan, M. Liguori, D. Marinucci, and S. Matarrese. An Esti- 
mate of the Primordial Non-Gaussianity Parameter /nl Using the Needlet Bispectrum 

from WMAP. The Astrophysical Journal, 701:369-376, August 2009. . 
V. G. Spokoiny. Adaptive hypothesis testing using wavelets. Ann. Statist., 24(6):2477- 

2498, 1996. ISSN 0090-5364. . URL http://dx.doi .org/10. 1214/aos/1032181163. 
J.-L. Starck, Y. Moudden, P. Abrial, and M. Nguyen. Wavelets, ridgelets and curvelets 

on the sphere. Astronomy & Astrophysics, 446:1191-1204, 2006. 
The Pierre AUGER Collaboration, J. Abraham, et al. Correlation of the highest-energy 

cosmic rays with the positions of nearby active galactic nuclei. Astroparticle Physics, 

29:188-204, April 2008. . 
The Pierre AUGER Collaboration, P. Abreu, M. Aglietta, E. J. Ahn, D. Allard, 

I. Allekotte, J. Allen, J. Alvarez Castillo, J. Alvarez-Muniz, M. Ambrosio, and et al. 

Update on the correlation of the highest energy cosmic rays with nearby extragalactic 

matter. Astroparticle Physics, 34:314-326, December 2010. . 
D. F. Torres and L. A. Anchordoqui. Astrophysical origins of ultrahigh energy cosmic 

rays. Reports on Progress in Physics, 67:1663-1730, September 2004. . 
Hans Triebel. Theory of function spaces. II, volume 84 of Monographs in Mathematics. 

Birkh&user Verlag, Basel, 1992. ISBN 3-7643-2639-5. . 

G. T. Zatsepin and V. A. Kuz'min. Upper Limit of the Spectrum of Cosmic Rays. Soviet 
Journal of Experimental and Theoretical Physics Letters, 4:78 — h, August 1966. 



imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



34 



G. FAY ET AL. 





J 


k 


3 


4 


5 


6 




P = 


: 1 


77 


100 


100 


100 


Multiple 


P = 


2* 


71 


100 


100 


100 




P = 


OO 


26 


36 


42 


43 




P = 


: 1 


61 


61 


61 


61 


PlugIn 


P = 


= 2 


60 


60 


60 


60 




P = 


00 


31 


36 


36 


36 


NN 










99 




TwoPC 










60 







J 


k 


3 


4 


5 


6 




P = 


: 1 


96 


100 


100 


100 


Multiple 


P = 


2* 


78 


100 


100 


100 




P = 


OO 


56 


66 


66 


63 




P = 


: 1 


68 


68 


68 


68 


PlugIn 


P = 


= 2 


68 


72 


72 


72 




P = 


00 


42 


52 


52 


52 


NN 










LOO 




TwoPC 










80 





(a) Uniform exposure (b) Auger exposure 

Table 2 

Power (in %) under (H\) with number of sources n s = 100, sample size n = 25, and for 

tests of size a — 0.01. The increase of power can be understood by the growing effective 

density of observations, at fixed sample size, when restricting the field of observation 

from the whole sphere to the Pierre-Auger acceptance 
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Table 3 

Power (in %) under (Hi), under Pierre-Auger exposure, with S — 0.04 and 6 — 5. 
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Table 4 

Power (in %) under (Hi), under Pierre-Auger exposure, with 5 — 0.08 and 9 — 5. 
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Table 5 

Power (in %) under (Hi), under Pierre-Auger exposure, with M = 100 and 8 — 10. 
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(a) n = 25 (b) n = 100 

Table 6 

Power (in %) under (Hi), under uniform exposure, with n s — 100 and Emin — 10 19 . 
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(a) n = 25 (b) n = 100 

Table 7 

Power (in %) under (Hi), under uniform exposure, with n s — 500 and Emin — 10 19 . 
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Table 8 



(b) n = 100 



Power (in %) under {HI), under uniform, exposure, with n s — 500 and -Emm 
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(a) n = 25, S = 0.08 



(b) n = 100, 8 = 0.04 
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(c) n = 400, S = 0.02 

Table 9 

Power of the tests for three models of (Hf) with values of S and sample size varying so 

that \/nd{f,g) remains constant. It appears that those particular sequences of powers are 

generally non decreasing with the sample size. 
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n = 100, p = 2, model = H-1-expos-unif-delta-0.08-nbdeg-5 
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Fig 7: ROC curves for the four different methods and the model (Hf ) with 
n = 100, S = 8%, and two values of 9. For the needlet methods, the L 2 norm 
is used. It is illustrated here that the behavior of the needlet based tests do 
not vary a lot with respect to 9, contrary to the TwoPC method. See also 
the text for details and Figure 9. A ROC plots the powers (or true positive 
rate) in q]jdJ£tati aga)nj;t ^ftjfc§fh^UP 1 dl^^yMS|fe r %^ie)jilSe^ci^ft2 
Insets display the same curves as in the main plot with a logarithmic scale in 
abcissas, to highlight the comparative performances for relevant level values, 
in 
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n = 25, p = 2, model = H-3-expos-unif-nsources-500-Emin-1e+19-Emax-1e+21-alpha-4.2-rmax-70 




(a) n = 25 



n = 100, p = 2, model = H-3-expos~unif~nsources-500~Emin-6e+19-Emax-1e+21-alpha-4.2-rmax-70 
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Fig 8: ROC curves for the four methods. For the needlet methods, the de- 
biased 1? norm is used. A ROC plots the powers (or true positive rate) in 
ordinate against the test level (or false positive rate) in abscissa. Insets dis- 
play the same curves as in the main plot with a logarithmic scale in abcissas, 
to highlight the comparative performances for relevant level values. 
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Fig 9: The empirical power associated with the Multiple (left column) and 
TwoPC (right column) procedures at prescribed levels a = 5% with respect 
to their key parameters J* and So, respectively. The prescribed levels of the 
tests are 5%. The three models under consideration are provided by the 
alternative (Hf) with 9 = 5°, 10° and 20°. The number of observations is 
n = 25 on the first line, n = 100 on the second line, n = 400 on the third 
line. 
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Fig 10: The empirical power associated with the Multiple (left column) 
and TwoPC (right column) procedures at prescribed levels a = 5%, with 
respect to their key parameters J* and 6q, respectively. The three models 
under consideration are provided by the alternative (-fff) with n s = 500, 
and -Emm = 10 19 , 4.10 19 or 6.10 19 eV. The number of observations is n = 25 
on the first line, n = 100 on the second line, n = 400 on the third line. 
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APPENDIX A: ON-LINE MATERIAL: NEEDLET CONSTRUCTION 

AND PRACTICAL USAGE 

A.l. Construction of the needlets on the sphere in a nutshell. 

A frame is a collection of functions with properties close to those of a basis. 
Tight frames share many properties with orthonormal bases, though they 
are redundant (see Daubechies, 1992, for details). 

Needlets were introduced by Narcowich et al. (2006b) as a particular con- 
struction of a wavelet tight frame on the sphere. They have been studied 
in a statistical context ( e.g. Baldi et al., 2006; Baldi et al., 2007)) and 
have also been used recently for cosmological data analysis problems (e.g. 
( e.g. Pietrobon et al., 2006; Fay et al., 2008; Delabrouille et al., 2009), 
where they apply to measurements of continuous random fields. An innova- 
tive aspect of the present work is to apply them in a detection context for 
point process data. The most distinctive property of the needlets is their 
simultaneous perfect localization in the spherical harmonic domain (actu- 
ally they are spherical polynomials, see Definition 1 below) and potentially 
excellent localization in the spatial domain. 

We recall briefly the construction due to Narcowich et al. (2006b), which 
is based on three fundamental steps: Littlewood-Paley decomposition (25), 
splitting (27) and discretization (29). Further details may be found in Baldi 
et al. (2008), and a discussion on non-tight frame construction in Guilloux 
et al. (2009). 

Let < a < 1, be a C°° non negative function defined on [0, oo). Take 
B > 1. We impose a to be identically 1 on [0, 1/B] and compactly supported 

on [0, 1]. The function b(x) =' a(x/B) — a{x) is then a very smooth function 
supported by [1/B,B]. 

Denote by H = L 2 (§, /z) the Hilbert space of square-summable functions 
on the sphere relatively to the Lebesgue measure \i with total mass An. Let 
{Yem}e£N,\m\<e be the usual spherical harmonics that form an orthonormal 
basis of Ti. Define Tie = span{Y^ m } m= _^ j ...^ and ILj the orthogonal projector 
on 7i£. We have 

VfeH,n i f=[f(y)L e (.,y)dy (24) 

with Li(x,y) =' Li(&cos(x ■ y)) and Li(») is the Legendre polynomial of 
order I normalized by the relation Lg{\) = (21 + 1)/(4tt). Define Ao(f) = 
li Is f( x ) d% and the sequence of linear operators 

imsart-aoas ver. 2011/11/15 file: densityCR.tex date: June 26, 2012 



42 G. FAY ET AL. 

Obviously, 



lim ||(A +^B,-)(/)-/|| 2 = (25) 



Denning 



and 



B^y)- Y, KB- j £)L £ (x,y) 



Dj{x,y):= Y, Vb(B- j £)L e (x,y) 

Bi- 1 <£<Bi+ 1 
we check, using (24) that 

V(x,y) £ S 2 , / Dj(x,u)Dj(u,y)du = £j(x,y) . (26) 

V/ E ft, £,(/) = [ B 3 (.,y)f(y)dy . (27) 

Let us now suppose, as important ingredient of the construction of the frame, 
that there is a positive quadrature formula for © i<; g2+j Hi, j G N which is 
valid on the sphere (see Narcowich et al., 2006a, b, for an existence result). 
This means that there exists a finite set Xj = {CjfcjfceA: °f §> an d for all £j&, 
there is an associated coefficient Xjk > 0, such that for all / € ® K g 2 +i Hi, 
we have the following interpolation formula: 



I f(y)dy= £> ife /(£,- fc )- 



As a consequence we rewrite (26) as 

B j( x ,y) = Y X Jk D j( x ,£,jk)Dj{£jk,y) (28) 

which will directly induce the definition of the needlets. From (27) and (28), 
we get 

Bj(f) = Y V^kDji; Ck) J f(y)^X~ k D j (y, £ jk ) dy (29) 

with as a consequence, the following Definition and Proposition. 
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Definition 1. We define the analysis needlets as follows: for k £ JCj 

ipjk(x) = y/\jkDj(x,€ jk ) = a/X^ ^2 Vb(B~ j £)L t (x,( jk ). 

Bi- 1 <£<&>+ 1 

Proposition. The set {^_i} U {ipj&j > 0, k e JCj} with V-i,o = 
(47r) -1 ' 2 and /C_i = {0} is a frame: for any f € T~L, 

\\f\\ 2 = £ K/><M 2 

j>-l,keKj 

f = A (f) + J2 B if = E </' ^fc^i* 

j j>-l,k£Kj 

A. 2. Numerical issues. 

A.2.1. Computation of needlet coefficients and related quantities. The 
direct computation of the needlet coefficients f3jk using expression (10) or 
7jfc using (31) is too lengthy at high scales (in the case of a relatively high 
number of observations n). The evaluations and simple or double sums is not 
prohibitive for the typical size of n (100-1000) that we use in our applications 
but the number of needlet coefficients grows as 0(B 2 ^) with the scale index 
j. However, with arbitrary precision, those calculations are advantageously 
performed in the multipole domain, using existing fast direct and inverse 
spherical harmonic transform associated to some pixelizations schemes. This 
restricts the choice of the pixelization. 

Namely, from the collection of points (Xi)i = i n and some tessellation of 
the sphere, one computes an integer valued map X which counts the number 
of events falling in each pixel, i.e. for each pixel p, 



** = !>*< 



i=i 

If the resolution of the pixelization is high enough, this discretization op- 
eration on incidence angles can be considered negligible when evaluating 
the overall performances of the statistical procedures. In practice, we take 
a discretization finer by far from the minimal requirement. We have used 
the HEALPix 4 package (Gorski et al., 2005), which provides a particular 
scheme of discretization of the sphere, the associated fast direct and inverse 
spherical harmonic transforms and utilities. It is very commonly used by cos- 
mic microwave background scientists. In the HEALPix framework, we took 



http : //healpix . jpl . nasa . gov 
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nside = 256 meaning that each sky map contains 12 x nside~ = 786432 
pixels. We denote those points by (Cp)pe-p- Then, we rely on a fast algorithm 
to compute the "multipoles" 

aim =' 2 , ^jkYtm(£,p)X(£, p ). 

per 

The coefficients (3jk are obtained by smoothing those multipoles by yb{2~H 
i.e. the filter shape at scale j, and evaluating the corresponding field at the 
cubature points of scale j namely 



hk = x )k 5Z ^( 2 ' 1 £) a tmYe m (£jk) ■ 



v l/2- 

tin 

The following diagram sums up those operations 



PQ)i=i,...,n -> map X p -> a^ m 4> \/&(2 ■ 7 ^)a^ m =4> x jk Pjk 
Double arrows denote as many transforms as the number of scales. On 

— 1/2 " 

each map of (rescaled) coefficient X- k /3j ; k, one can apply some thresh- 
olding. Then the same filtering operation (multiplication of the multipole 
moments by the filter yb(2~H) of the maps of coefficients leads to the map 
Ylk ^j,k^j,k- Summing those maps over the scales up to j = J results in 
the smooth estimate fj. For details on the practical use and design of the 
needlet transform of continuous random fields, see Guilloux et al. (2009). 

Remark. Those rather lengthy operations are necessary for accounting 
a thresholding procedure on the coefficients, as in the case of the PlugIn 
test, in order to obtain an estimate such as (15). In the Multiple approach 
however, this numerical implementation of the computation of the smooth 
and linear estimates fj defined in (11) can be obtained much more directly 
from the application of a smooth low-pass filter to the multipoles moments 
of the map X p . The transfer function of this low-pass filter is given by the 

Jj - J ' 
to zero smoothly between £ = 2 and £ = 2 +1 . 



function X^xjK^ J '^)> which is equal to one up to £ = 2 J and then decays 



As building-block function a, we have considered here a spline function 
of order 15 satisfying the above-mentioned conditions (see Figure 11). The 
corresponding filters £ i— > yb{2~H) are plotted in the multipole domain. 
Figure 12 represents the profiles 9 h-> J^ y/b(2~H)L£(cos9) which are the 
exact shapes of the needlets at each scale, up to the -\f\jk factor. It illustrates 
the increasing spatial concentration of the needlets as the scale j grows. 
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Fig 11: Upper panel: The spline function a (order 15) and the function b = 
a(-/B) — a(-). Here B = 2 so that a is in (0, 1) on [1/2, 1] and b is compactly 
supported by [1/2,2]. Bottom panel: The filter shapes associated to the five 
first needlets in multipole domain, namely I \— > vb(2~H),j = 1, . . . , 5 




e (in degrees) 



Fig 12: The shape of the five first needlets in the spatial domain as the func- 
tion of the co-latitude 9. Recall that all the tjjj^ functions are axisymmetric 
around the points £j ; j-- 
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In the PlugIn procedure, the needlet coefficients are thresholded using 
the data-driven rule (15). The computation of a 2 k and 5jk requires addi- 
tionally to convolve the map X with the functions ip 2 k . This operation is 

performed along the same lines as above, replacing the factors \b(2~H) by 
b' 4 f such that 

ip 2 k {x) = X jk Yl b 'j,e L e( x ,S,jk)- 

A. 2. 2. Choice of the needlets. Relations between the regularity of a and 
the asymptotic localization properties of the needlets is described in Nar- 
cowich et al. (2006b) (see also ?Guilloux et al., 2009). Here, because of the 
smoothness the exposure function g of the instrument, the localization of 
the needlet is not such a sensitive issue as in the case of analysis of a random 
field with missing data. We have chosen dyadic needlets (B = 2) and a being 
a spline function of order 15, which leads to simple but sufficiently concen- 
trated analysis wavelets. Those needlets are represented in Figure 11 and 12. 
Note however that asymptotic localization properties are theoretical prop- 
erties and that in our small sample studies, maximal scales involves angular 
frequencies smaller than £ = 128. This is not "high frequency" compared to 
usual analysis in CMB analysis for instance. 

A. 2. 3. Computation of LP distances. Let / be some estimate of / based 
on the coefficients fijj.. Whatever adaptive needlet method we use, we need 
to measure the discrepancy between the observations and the null hypothesis 
using LP norms with p = 1, 2, oo, say. An exact numerical evaluation of an LP 
distances between / and g requires some quadrature formula. Given a high 
resolution scheme and for any p > 1, we approximate LP norms and distances 
by estimating / and g on a regular grid (or pixelization) (£,k)k=i,...,N, and 
then by computing 

d P (f,g) d =d p (f,g) = \\f-g\\ p ^( £ \ k \f(Zk) - g(tkW) 1/p (30) 

fc=l,...,iV 

for some quadrature weights (A&). We have used above-mentioned Healpix 
weights at resolution nside=256. In some particular cases (p = 2 for in- 
stance), an exact evaluation is possible. Assume for example that both / 
and g are "band-limited" in the sense that they belong to spanjf/'jfc, j < 
J,k£ K,j}. Then, thanks to the properties of the needlets (tight frame, 
semi-orthogonality) 

11/- sill = E E(M/)-Ms)) 2 - 

j< j+i fee/Cj 
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However, with /3jk{f) = $jk (defined in Eq. 10), {Pjk{f) — (3jk(d)) 2 is a biased 
estimate of (/3jk(f) — Pjk(g)) 2 whereas an unbiased estimate is provided by 



Cjk 



=' 7 l _^ £(V»;*M - f3 jk {g))^ jk (X v ) - P 3k {g)). (31) 



Thus, ||/ — g\\\ may be estimated unbiasedly by 

d2*(f,g) = Y^ Y tjk- 



(32) 



Note that the Cjk s an d this estimate are not necessarily positive. More- 
over, this technique does not extend to non linear estimates of / that use 
thresholded coefficients. 



n = 100, method = thresh, model = H-1-expos-auger-delta-0.08-nbdeg-20 



n = 25, method = thresh, model = H-3-expos-unif-nsources-5a0-Emin-1e+19-Emax-1e+21-alpha-4.2-rmax-70 
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Fig 13: Effect of the choice of the norm on the sensitivity of the Multiple 
procedure, depending on the shape of the density under the alternative. Top: 
Unimodal density given by model (Hf) with 5 = 0.08, 9 = 20° (sample size 
n = 100). Bottom: Multimodal density given by model (Hf) with E m \ n = 
10 19 eV, n s = 500 (sample size n = 25). 
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